Continent-island boundary and environment-shaped evolution in the marine amphipod Ampithoe marcuzzii complex (Crustacea: Eumalacostraca: Ampithoidae)

Marine amphipods are crustaceans that lack a larval phase and consequently have low dispersion rates. Despite that, these crustaceans present a remarkable ability to be transported by rafting on natural floating substrata, especially macroalgae, where they find shelter, food and a mating ground. The species Ampithoe marcuzzii is widely distributed throughout the western Atlantic Ocean. Here, it was used as a model to study seascape genomics and phylogeography in invertebrates with low dispersion capacities. We anticipated that the lineages would present isolation-by-distance patterns. However, surface currents and other abiotic variables could facilitate connectivity among distant sites. Based on mitochondrial and nuclear genes, SNPs, and environmental associations, we observed the presence of a species complex within A. marcuzzii, separating mainland and insular populations. Each species showed an independent evolutionary history, with a strong latitudinal population structure and evidence of isolation-by-distance and isolation-by-environment, characterizing the 'continent' species. Historical expansion and environmental variables were observed associated with the southeastern population, and ecological niche modeling corroborated the region as a paleorefuge. Conversely, populations from 'islands' presented complicated evolutionary histories, with closer localities genetically isolated and distant localities connected. These findings indicate that insular populations with low dispersion capacity might be more susceptible to spatial connectivity by floating substrata and to changes in surface currents. In contrast, mainland populations might be more vulnerable to local climate changes due to lack of gene flow.


SNPs prospection and data analysis
The sequenced libraries ranged from 312,772 to 12,261,962 reads per sample.After filtering and clustering steps, the average consensus read number was 26,284 (Supplementary Table S1).One 'merged' dataset was used for Ampithoe marcuzzii s.l., in which 5,857 SNPs were recovered from 134 specimens, with 31.19% of total missing data (Supplementary Fig. S1).Two posterior datasets were built, with Saint Martin (SM) specimen removal and separation of the ' continent' and 'islands' lineages (defined in Table 1) for comparative population analyses, resulting in 21,126 SNPs, 72 specimens and 27.55% of total missing data, and 17,237 SNPs, 55 specimens and 16.09% of total missing data, respectively (Supplementary Figs.S2 and S3).

Phylogenetic analyses and divergence time estimates
mtDNA and 28SRNA Three distinct molecular operational taxonomic units (MOTUs) were identified by the phylogenetic tree and species delimitation analysis (Fig. 1a and Supplementary Fig. S5), showing a clear distinction between A. marcuzzii s.l.from the ' continent' , 'islands' , and SM.A ' continent' clade was recovered, with northeastern Brazil (NE) being a sister clade of eastern Brazil (E) and southeastern Brazil (SE) being the sister clade of E + NE.As shown in Fig. 1a, the multilocus tree of mtDNA + 28SrRNA was used to infer the divergence times within Ampithoe marcuzzii s.l.Specimens from the NE localities of Caucaia (CAU), Flecheiras (FLE) and Paracuru (PCR) clustered within NE, while Trancoso (TRA), in the eastern mainland, belonged to E. 'Islands' clade did not show islands' structure patterns, clustering all the oceanic islands (OI) together, while separated from the Dominican clades of Caribbean Dominica (CDM) and San Sauveur (SSV) (Fig. 1).

SNPs
Unrooted phylogenomic reconstruction agreed with the spatial distribution of A. marcuzzii s.l., with the separation of the main clades into SE and NE continent, and one clade grouping the Brazilian OIs and the Dominican sites in the Lesser Antilles (LA, Fig. 1b).Moreover, phylogenomic analyses also revealed locality monophyly for several sampling sites, including one clade from the estuarine island of Boipeba (BOI), Praia do Forte (FOR) in the NE continent, Cabo Frio (CF) and Ubatuba (SPD) in the SE continent, the OIs of Trindade (TR) and Rocas Atoll (RA), CDM and SSV (Fig. 1b).
A highly supported clade named 'islands' was recovered, comprising a clade from the tropical oceanic island TR and another from the equatorial island RA.On the other hand, Fernando de Noronha (FN) was recovered as paraphyletic and including TR and RA clades (Fig. 1b), clustered together the three Brazilian OIs.The OIs are a sister clade of CDM.SSV is the only locality sampled in Atlantic Dominica and is the sister clade of OI + CDM, while SM is the sister clade of all remaining 'islands' specimens.

Genetic diversity and population structure
Because of the low sampling of SM specimens, further analyses were performed for ' continent' and 'islands' only.Genetic diversity indexes slightly differed between the ' continent' and 'islands' , with overall haplotype diversity (H) similar for the ' continent' and 'islands' , and nucleotide diversity (θ π ) between individuals being lower for ' continent' for COI (θ π = 1.50) and SNPs (θ π = 118.60),and higher for 16SrRNA (θ π = 12.33) and 28SrRNA (θ π = 0.12) (Supplementary Table S2).Overall θ π values for the 'islands' varied from 0.00 for 28SrRNA to 260.24 for SNPs.Within the ' continent' , greater diversity varied among markers, with values of 8.00 in Flecheiras (FLE) (NE) for COI, 57.60 in CF (SE) for 16SrRNA, 0.536 in Cabo de Santo Agostinho (CSA, NE) for 28SrRNA and 194.00 and 187.48 for SNPs in FOR (NE) and Abrolhos (ABR, E), respectively.Regarding the 'islands' clade, the highest values of θ π also varied among localities, with values of 14.55 in CDM for COI, 31.72 in FN for 16SrRNA and 409.64 in RA for SNPs, while θ π values were null for 28SrRNA (Supplementary Table S2).F IS values across the loci were only significant for two localities in the 'island' dataset (RA = 0.093 and SSV = 0.382, p < 0.05, Supplementary Table S2).
Using the genetic clusters E, NE and SE for ' continent' , and OI, CDM and SSV for 'islands' to perform a hierarchical AMOVA, the ' continent' results of all markers suggested that most of the total variance is retained among groups, with high significant Φ ST and Φ CT indices for all genes and the SNPs (Table 2).The 'islands' results recovered most of the total variance among groups for COI and 28SrRNA, while 16SrRNA and SNPs showed most of the variance within populations (Table 2).
The pairwise F ST suggests augmented differentiation between localities from different regions (i.e., NE, E and SE) in the ' continent' clade for all markers (Supplementary Fig. S5).Moreover, within the 'islands' clade, the highest values of F ST were observed between SSV and TR for all markers (16SrRNA F ST = 0.68; 28SrRNA F ST = 1.00;COI F ST = 0.69; Supplementary Fig. S4, Supplementary Table S3).The most conserved marker used, the nuclear 28SrRNA, presented higher pairwise F ST values (28SrRNA F ST = 1.00).In contrast, most variable SNPs recovered less differentiation between populations.Mantel tests with the SNP dataset for IBD were significant for the ' continent' (Mantel r 2 = 0.71, p = 0.002), whereas no evidence of IBD was observed for the 'islands' (Mantel r 2 = 0.06, p = 0.334).

mtDNA and 28SrRNA
The haplotype networks of COI, 16SrRNA and 28SrRNA presented a clear geographic pattern with two sets of haplotypes separated by over 100 mutational steps in the 28SrRNA network (Fig. 2).These two main sets of haplotypes are consistent with the ' continent' and 'islands' clusters.In all networks, specimens from NE and SE  localities are generally characterized by the presence of private haplotypes, with one shared haplotype between localities of the NE region in all network (Fig. 2).The population structure from region E in the continent diverged among networks, with unique haplotypes from ABR and BOI for COI and 16SrRNA and shared haplotypes between ABR and BOI for 28SrRNA (Fig. 2).The 'islands' clade, on the other hand, presented different network shapes, mainly constituted of private COI and 28SrRNA haplotypes, and a primary shared 16SrRNA haplotype (Fig. 2).Concerning COI, specimens from OI localities shared haplotypes and specimens from the LA presented private haplotypes.In turn, 16SrRNA had a primary haplotype shared by seven localities, including the OI and LA sites (Fig. 2).

SNPs
PCA and DAPC scatter plots of the SNPs showed a symmetric spatial distribution of clusters NE, E and SE within ' continent' for the first two PC axes (Fig. 3a and Supplementary Fig. S6), with the main separation of NE and E from SE in the first axis, and the separation of E from NE and SE on the second axis (Fig. 3a and Supplementary Fig. S6).Within the 'islands' , the PCA's first axis separated the three genetic clusters of the OIs, CDM and SSV.
In comparison, the second axis mostly separated CDM from the two remaining genetic clusters (Fig. 3a).Lastly, the DAPC analysis suggested four clusters, including two from SSV (Supplementary Fig. S7).The genotypic structure attribution recovered with SNPs resulted in two genetic clusters for the entire Ampithoe marcuzzii s.l.dataset, in strong agreement with the two proposed clades of ' continent' and 'islands' (Fig. 3b and Supplementary Fig. S8), except for one single individual (SSVS5F8) from the Atlantic Dominican locality of SSV that presented 28% of ' continent' cluster identity (Fig. 3b).A second Structure analysis with separated datasets into ' continent' and 'islands' recovered two and three genetic clusters, respectively (Fig. 3d, e).The ' continent' analyses revealed two Ks with the highest ΔK values: Structure analyses with K = 2 suggested complete separation of populations from NE + E and SE (Fig. 3d), whereas K = 3 separated E from NE (Supplementary Figs.S9 and S10).The 'islands' analyses were also geographically congruent, separating individuals into three genetic clusters corresponding to OIs, CDM and SSV samples (Fig. 3e and Supplementary Fig. S11).

Demographic history and divergence times mtDNA and 28SrRNA
The investigation of mtDNA demographic history was obtained from Tajima's D and Fu's Fs analyses.The results did not detect significant evidence of population expansion for most lineages, except for the OIs (D = −2.69,p < 0.005, Supplementary Table S4).The estimates of variation in population size over evolutionary time based on the Bayesian skyline plots suggested that lineages' population sizes have remained constant for long period before declining over their recent evolutionary past (Supplementary Figs.S12-S15).

SNPs
For SNPs, demographic history was assessed by Fastsimcoal with three different expansion scenarios for each dataset.As a result, the most likely expansion scenario within the ' continent' occurred on SE toward E and NE regions around 601,900 generations ago with greater gene flow from the SE to NE (Fig. 4).In the 'islands' clade, the most likely scenario was the expansion of the Caribbean SSV population to the CDM and OIs, around 373,749 generations ago and migration rates showed a more robust gene flow from SSV to the OIs (Fig. 4).The historical migration between localities also inferred by Fastsimcoal resulted in highest probabilities of migrants by SINO (SE) as source and NAT (NE) as the receiver in the ' continent' , and between the equatorial islands, with RA as the source and FN as the receiver in the 'islands' (Fig. 3d, Supplementary Figs.S16 and S17).The other most likely routes of migration were from CF (SE) to BOI (E), from CDM to FN, and from FN to TR (Fig. 3e, Supplementary Fig. S17).

Climate association analyses
Ecological niche models' (ENMs) suitability maps were built based on the ' continent' and 'islands' distribution of specimens and variables were selected a priori based on factorial analysis: distance to shore (biogeo5), concavity (biogeo7), sea surface salinity (SSS) of the saltiest month (biogeo10), annual range in SSS (biogeo11), sea surface temperature (SST) of the coldest month (biogeo14), and annual range in SST (biogeo16), as well as bathymetry because of its biological importance for benthic macroalgae distribution (Supplementary Table S5).ENM results confirmed the clade ' continent' distribution suitability in the South American continent, including greater suitability on shallow waters of the continental shelf, mainly in the Brazilian SE and NE regions (Fig. 5).A small patch of adequate climate for ' continent' A. marcuzzii is also observed on Venezuela's coastline (Fig. 5).Suitability for 'islands' clade, on the other hand, is much lower, with a thin, suitable line on the margin of the continental shelf and islands of the FN seamount chain (Fig. 5).In the Caribbean, adequate sites include the islands of the LA (Fig. 5).Paleodistribution models suggested a considerable geographical expansion from the LGM to the present time for the ' continent' clade, with a refuge in southeastern Brazil, with higher suitability in the SE region along the margin of the continental shelf.In contrast, the paleodistribution of 'islands' revealed higher suitability in the OI, including the Vitoria-Trindade seamount chain (Fig. 5), showing a pattern of distribution retraction from the LGM to modern days.In the Caribbean, the less shallow areas in the LGM increased suitability from the past model to the present (Fig. 5).

SNPs
The association of environmental variables with the structure values (F ST ) and geographic distance tested by the partial Mantel test for isolation-by-environment (IBE) indicated that more climatic factors are related to the ' continent' clade variation, including seafloor structure, oxygen concentration, nitrate concentration, salinity and temperature (Supplementary Table S6).On the other hand, changes in the 'islands' clade significantly varied with sea surface temperature and carbon phytoplankton biomass only (Supplementary Table S6).
The RDA recovered ten environmental variables with low correlation, with four variables common to the ' continent' and 'islands' (Supplementary Figs.S18 and S19): bathymetric slope, concavity, north/south aspect and plan curvature, all related to seafloor structure (Supplementary Table S6).Both analyses also recovered variables related to dissolved oxygen concentration, temperature and sea surface salinity, whereas chlorophyll and nitrate concentration were only found for the ' continent' , and carbon phytoplankton biomass was only recovered for 'islands' .The latter two variables agree with IBE association results (Supplementary Table S6).Within the ' continent' , genomic variation in SE populations of CF, SINO and SPD are explained by changes in dissolved oxygen www.nature.com/scientificreports/and nitrate concentration (Fig. 6).Within the 'islands' , individuals from SSV showed tremendous variation along the first axis (Fig. 6).Both Caribbean populations of SSV and CDM appear to be associated with minimum dissolved oxygen concentration, separated from each other on the second axis, while Brazilian populations from the Brazilian OIs are related to changes in maximum dissolved oxygen concentration, carbon phytoplankton biomass, sea surface salinity, annual temperature range and seafloor related variables (Fig. 6).

Discussion
Regardless of the genome-wide and large-scale sampling, the present study only partially explain the complex evolutionary and biogeographic history of A. marcuzzii s.l.Two MOTUs were found widespread in the Atlantic: ' continent' and 'islands' .While the ' continent' lineage presented a clear population structure, the 'islands' presented a much more intriguing pattern, lacking IBD and with closer, highly structured populations.Despite rafting being a powerful mechanism for fauna dispersion, especially peracarid crustaceans, A. marcuzzii s.l. was only www.nature.com/scientificreports/found associated with benthic macroalgae, being not found associated with pelagic Sargassum patches.Therefore, results of isolation and connectivity observed nowadays suggest complex evolutionary and biogeographic events possibly related to other traits not analyzed herein.The genetic population structure of Ampithoe marcuzzii s.l.unveiled a complex evolutionary history shaped by the environment.Its direct development plays a crucial role in the lineages diversification, demonstrated by the phylogenetic, phylogenomic and structure results that characterized the separation between the South American continent (' continent') and the clade that encompasses the Lesser Antilles and Brazilian OI ('islands').Despite not possessing A. marcuzzii s.l.samples from the North Atlantic coast, sampling from the Caribbean and South Atlantic demonstrated that lineages are not shaped by geographical isolation only.Thus, further sampling in the Caribbean region is essential to understand better the biogeographical processes that A. marcuzzii s.l. were subjected.Here, we considered A. marcuzzii s.l. as a species complex, and each MOTU (' continent' , 'islands' and SM) was treated separately for population analyses purposes.
These patterns of diversification between continent and oceanic islands were previously observed for other peracarid species 16,19,20 , where 19 found high levels of intraspecific divergence and endemism for Hyalidae amphipods in the Macaronesia islands using COI sequences.Similar results were found for the three most diverse orders of Peracarida (Amphipoda, Isopoda and Tanaidacea) in the region 16,20 .According to 15 , the extinction and speciation rates have a stronger role as drivers of evolution in islands than the immigration rate for species with low dispersion capacity, where dispersion might have a larger impact in the first stages of the islands' formation.
In the western Atlantic, pieces of evidence of the ' continent' and 'islands' separation were previously observed in the Brazilian waters for the coral Mussismilia hispida (Verrill, 1901) 22 , showing that fauna associated with benthic macroalgae, especially the ones with low dispersion capacity, may reflect similar patterns to sessile invertebrates 22 .In agreement with M. hispida results, A. marcuzzii s.l. also presented high gene flow between TR and EI, despite being separated by over 1800 km.On the other hand, there was a lack of gene flow between EI and geographically closer continent (NAT).This scenario could be because surface currents observed nowadays do not reflect the lineages' ancient divergences or the ancient currents.Instead, we observed that other environmental variables might shape the distribution of the ' continent' and 'islands' .While several climatic features are associated with the genomic structure in the ' continent' , only two variables related to the sea surface were associated with the 'islands' .Thus, a combination of abiotic and biotic acting factors might have led the 'island' clade to persevere in the harshest environments.One could hypothesize that salinity is critical for clade diversification since the OIs are mainly deprived of freshwater sources.However, sea surface salinity was not identified as a potential genetic structure driver.Despite the 'islands' clade being in the OIs, this result is probably because the same lineage is abundant in the Caribbean region, which is rich in freshwater sources and has a strong seasonal rainfall.
Indeed, the 'islands' lineage showed greater genomic diversity within most sites than the ' continent' one, possibly due to a greater ability to adapt genetically to environmental change 23 .The resilience and competitive capacity of the 'islands'' A. marcuzzii might be distinctive, considering that it is the only species observed in all sampled islands so far (T.Iwasa-Arai and S. Siqueira, pers.obs.).
After the 'continent' and 'islands' separation, clades presented independent evolutionary histories, with speciation processes and environmental factors that drove each lineage to a different scenario.The population structure also reflects the direct development of amphipods, observed within populations NE, E and SE for the 'continent' and OI, CDM and SSV for the 'islands' , whose strong population structure is congruent with the IBD in the 'continent' and the monophyly of most localities.
A recent assessment of the Brazilian marine biogeography using a multi-taxa approach suggested five geographic bioregions (Abrolhos, N/NE, NE, SE/S and S) along the continent considering species distributions, mainly associated with sea surface temperature, water turbidity and current velocity 24 .The genetic structure observed in the 'continent' A. marcuzzii was partially congruent to three bioregions, as the MOTU was not found in the southern region and the northern region was not sampled.The assemblages analyzed by 24 clustered the state of Bahia as a sub-bioregion within the northeast bioregion, while Abrolhos was recovered as an independent bioregion.In contrast, our analyses suggested ABR grouped with other Bahia localities.
While NE localities presented low mean genetic diversity in all markers, populations from E and SE showed higher values of population diversity.This result of high diversity within SE is congruent in most analyses, including the divergence time estimates with the first split of the SE population, the migration exchange from SE to both E and NE sites and by the estimates of historical expansion of SE around 601,900 generations ago.Therefore, SE had a larger suitable area during the LGM (Fig. 5), allowing geographic and genetic diversification.In contrast, populations from the E and NE were highly affected by the lower sea levels, a pattern observed for other marine invertebrates such as marine gastropod Littoraria flava 25 .
The congruence of higher diversity in higher latitudes is an unusual finding since most of the marine species, from macroalgae to corals and crustaceans, presented lower levels of genetic diversity in the southernmost regions when assessed by microsatellites and mitochondrial markers 22,26 .However, higher diversity found in southernmost sites was also observed for L. flava in a genome-wide study 25 , indicating that evolution evidenced by SNPs reflects more recent timescales.
Barriers to gene flow across the continent are detailed and discussed in the recent review by 27 , where the bifurcation of the offshore South Equatorial Current around Cape São Roque (NE) is the most frequent barrier with phylogeographic concordance.Other representative barriers included the Vitória-Trindade seamount chain (SE), the Cabo Frio upwelling system (SE), and the main river mouths of São Francisco (NE), Paraguaçu (NE), Jequitinhonha (E), Doce (E), and Paraíba do Sul (SE 27 ).Among these barriers, two barriers related to gene flow are observed as shaping the main lineages within the 'continent' .The first separated the SE from NE and E populations and are concordant with the Vitória-Trindade seamount chain.Hence, our data suggest that the most likely scenario was vicariance driven, with the presence of a panmictic population through the continent, www.nature.com/scientificreports/followed by the posterior barrier emergence in the Quaternary 28 .In addition, migration and ENM results further demonstrate the SE region as a refuge during glacial periods, with posterior migration and niche expansion in recent times, consistent with the pattern found in other marine coastal invertebrates (e.g. 25 ).
Despite the historical sea level dynamics in which the seamount chain acted as a barrier, other current environmental variables may maintain the isolation in this region.The IBE and RDA results showed that variables related to the seafloor structure play an essential role in the NE, E and SE populations structure (Supplementary Table 5).In contrast, seawater composition shaped the separation of SE populations (Supplementary Table S6, Fig. 6).In the RDA, we observed that the SE localities are directly associated with changes in dissolved oxygen and nitrate concentration (Fig. 6).Both these variables are higher in sites with robust water dynamism.Cold upwelling from the South Atlantic Central Water in this region could be vital in isolating the SE lineage 22 .The second barrier found to delimit the populations is the Paraguaçu river mouth in the Todos os Santos Bay (BA).Our study succeeded in sampling sites close to each side of the barrier (i.e., FOR and BOI), which do not share haplotypes and have significant pairwise genetic structuring (Fig. 3).
Interestingly, ABR, the furthest continental island, showed high genetic diversity and high connectivity with the Eastern populations in the continent (BOI, TRA; Supplementary Table S2, Fig. 3d).The Abrolhos Archipelago is the largest and most diverse reef complex of the South Atlantic 29 .Its species composition and abiotic characteristics suggest a unique environment that differs from all other continental Brazilian marine provinces 24 .Despite its singularity, the genetic connectivity among ABR and nearby localities in the continent for other taxa, such as the coral M. hispida 22 and endosymbiotic dinoflagellates associated with Favia gravida 30 , suggests that ABR might work as a genetic diversity reservoir in the region.
The second lineage is distributed throughout the Brazilian OIs of RA, FN and TR and in Dominica in the Lesser Antilles, presenting a much more complex evolutionary history.While phylogenomic analyses recovered the monophyly of all localities but FN (Fig. 1b), concatenated phylogeny of mtDNA + 28SrRNA only recovered main clades, which includes one well-supported clade from SSV, one from CDM, and one from the OIs (Fig. 1a).Although the consensus separation of the OIs from CDM, haplotype networks of 16SrRNA showed shared haplotypes between the two localities.Moreover, the structure analyses indicated a low level of admixture of the CDM genetic cluster in the FN population (Fig. 3c).This relationship between CDM and FN is corroborated by the historical migration analyses assessed with Fastsimcoal, which suggests a migration pathway from CDM to FN (Fig. 3d), and that could also be a sign of the ancient colonization route from the Caribbean to the OIs.Connectivity between the Caribbean and EI is also observed for the sally lightfoot crab Grapsus grapsus, a species restricted to the Caribbean and OIs in the Atlantic Ocean 22 .
Strong connectivity was observed among the OIs.Therefore, regardless of the distance between TR and EI, a robust gene flow is still observed across the southwestern Atlantic, which maintains the OI low genetic structure.The connectivity between TR and EI is also observed for the coral M. hispida 22 ; however, these results strongly differed from other insular species that co-occur in the OIs, such as the crabs G. grapsus and Johngarthia lagostoma, that showed connectivity among the EIs and high population structure in TR island 20,31 .Thus, despite the closer phylogenetic relationship with those crabs, the population structure of the 'islands' amphipods agrees more with sessile invertebrates.
Dominican populations also showed an interesting pattern.The localities of the Caribbean coast (SB and SOU) displayed high connectivity, while the Atlantic coast (SSV) remained isolated.Dominica is a small island of 750 km 2 ; therefore, we did not expect a high differentiation between Atlantic (SSV) and Caribbean (CDM) populations, which are closer than the Brazilian OIs.Indeed, the Structure analysis of the 'islands' separated into three genetic clusters, OI, CDM and SSV, in which CDM has a light contribution to the OIs' population.PCA also showed the separation of the same three clusters, with increased differentiation of the CDM from SSV in the second axis (Fig. 3b).
SSV divergence was corroborated by most of the analyses, and demographic analyses suggest a historical expansion of SSV around 373,749 generations ago (Fig. 4), with migration to the OIs and CDM, in agreement with the species originating in the Caribbean region and later dispersing to the southwestern Atlantic.The particularities of the SSV population are better observed with the environmental association provided by the RDA, where SSV specimens presented an association with minimum dissolved oxygen concentrations (Fig. 6).It is plausible that the locality is subjected to more environmental changes than the other sampled localities within the 'islands' since pelagic Sargassum patches are regularly washed to the region during boreal summer/autumn seasons (Supplementary Fig. S20).Gases (e.g., H 2 S, NH 3 , CO 2 , CH 4 ), organic matter, and high biological oxygen demand areas are formed by the biological degradation of brown algae in near-shore environments, giving rise to anaerobic zones 32,33 .In this sense, SSV is pivotal to understanding historical events and current processes that might shape future populations.
Genetic results of Ampithoe marcuzzii s.l.suggest that populations are geographically structured and, therefore, spatial connectivity, possibly by floating Sargassum dynamics, did not play a prominent role in the species' complex current pattern.Since A. marcuzzii is not commonly observed in floating Sargassum, regular Sargassum patches' movements might not be enough to maintain genetic connectivity between populations on the continent.Instead, stochastic events such as seaquakes would possibly be needed to connect benthic macroalgae to the pelagic Sargassum.In this sense, island populations are more likely to benefit from this physical connectivity.Changes in floating Sargassum dynamics are yet to be understood and further studied, and future assessments of A. marcuzzii populations could help track the genomic changes caused by the environment.
In conclusion, marine invertebrates with limited dispersion capacities, particularly those associated with macroalgae and other benthic substrata, may exhibit similar patterns of cryptic species complexes due to isolation and abiotic variation.Based on the findings from A. marcuzzii, populations on the continent seem to be linked to a more dynamic range of abiotic variables and hence might be more susceptible to local climate changes, unless the continent MOTU evolves quickly to local environments, and might be less susceptible to

DNA extractions, PCR amplification and sequencing
Genomic DNA was extracted according to the CTAB protocol of 36 .Partial sequences of the mitochondrial genes cytochrome oxidase subunit 1 (COI) and 16SrRNA and nuclear 28SrRNA were obtained for subsequent analyses.Fragments were amplified using the following primer pairs: HCO2198 and LCO1490 37 for COI, 16STf 38 and 16Sbr 39 for 16SrRNA and rd1 and rd4b 40 for 28SrRNA.
Polymerase chain reactions (PCRs) were carried out in a total volume of 15μL containing 1 × PCR buffer, 2.5 U Taq DNA Polymerase, 1.5 mM MgCl 2 , 200 μM of each dNTP, 0.35 μM of each oligonucleotide and ultrapure water to achieve the reaction volume.The cycling profile on the thermal cycler involved a denaturation step at 94 °C for 1 min, 35 cycles of denaturation at 94 °C for 1 min; annealing temperature for 1 min, extension at 72 °C for 1 min; and extension at 72 °C for 10 min.Annealing temperatures were 48 °C for COI and 28SrRNA and 52 °C for 16SrRNA.PCR products were purified with polyethylene glycol solution 15% (PEG) and then amplified for sequencing using a BigDye Terminator Cycle Sequencing Kit v.3.1 (Applied Biosystems, Waltham, Massachusetts, USA) at the Human Genome and Stem Cell Research Center, Universidade de São Paulo, São Paulo, Brazil.
Chromatograms were analyzed using Genestudio 2.2 (Genestudio Inc.), which was used to perform BLAST searches (http:// blast.ncbi.nlm.nih.gov) to check for contamination or sequencing errors.Individual consensus sequences were aligned using the online version of MAFFT (ver.7,see https:// mafft.cbrc.jp/ align ment/), with the Q-INS-i iterative refinement methods for 16SrRNA and 28SrRNA.Next, using TrimAl v.1.2 41with the gappyout option, we removed regions of questionable homology.Each marker was analyzed separately for most population analyses and concatenated for phylogenetic analyses.

Library preparation, sequencing and SNP filtering
Individual libraries were generated based on the genotyping-by-sequencing (GBS) method described by 42 and modified by 43 .Genomic DNA was digested with the PstI restriction enzyme (5′-CTG CAG -3′) and ligated to barcode and common adaptors with appropriate sticky ends.The products were pooled into sets of up to 60 samples and then PCR-amplified using generic primers matching the common adaptors under the following thermocycling conditions: 5 min at 72 °C, 30 s at 98 °C, 18 cycles of 10 s at 98 °C, 30 s at 65 °C and 30 s at 72 °C, and an extension step of 5 min at 72 °C.The presence of spare adapters and the sizes of the DNA fragments were assessed by quantification on an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, California, USA) with the Agilent DNA 1000 kit and by quantitative PCR using a Light Cycler 480II (Roche, Basel, Switzerland) with the Kappa Biosystems Kit (Wilmington, Massachusetts, USA) for library quantification.Libraries were constructed by EcoMol Consultoria (Piracicaba, São Paulo, Brazil), and sequencing was performed at the Center for Functional Genomics Applied to Agriculture and Agroenergy (LZT/ESALQ/USP, Brazil) in 100 bp single-end fragments using a HiSeq 2500 platform (Illumina Inc., San Diego, California, USA) in three lanes.
SeqyClean pipeline 44 was used to filter out sequences < 50 bp and to remove sequences with an average Phred quality score ≤ 20, adapters and contaminants from the UniVec database (NCBI, ftp:// ftp.ncbi.nlm.nih.gov/ pub/ UniVec/).The ipyrad v.0.7.28 program 45 was used to assign reads to individual samples and to edit and cluster reads into consensus sequences through paralogue identification.In ipyrad, reads with more than five Ns or shorter than 35 bp in length were discarded.The minimum read depth was set to six for calling consensus sequences within the samples, and the maximum depth was set to 10,000.The clustering threshold was set to 90%, and the maximum number of single-nucleotide polymorphisms (SNPs) per locus was set to 30.A locus had to be present in at least 50% of the samples to be retained in the final dataset.All the remaining parameters were set as default.The density of missing data per locus was observed through matrix occupancy 46 and PLINK 47 removed SNPs with a minimum allele frequency of < 1%, missing genotypes higher than 50% and linkage disequilibrium (r 2 = 0.50).

Phylogenetic analyses and divergence time estimate mtDNA and 28SrRNA
For comparative phylogenetic analyses, COI fragments, 16SrRNA and 28SrRNA fragments from Ampithoe marcuzzii s.l. were concatenated.Sequences of Ampithoe ramondi Audouin, 1826, from Brazil, were used for rooting (GenBank accession nos: OP498032, OP494671, OP494690).The best partition schemes and models were determined in ModelFinder 1.5.4 48based on the modified Akaike Information Criterion (AICc) available on IQ-TREE 2 web server 49 .The optimal partitioning strategy and evolutionary models consisted of subdividing the COI data set by codon positions with the model TN + F, Tne and F81 + F for each codon position, TPM3 + F + I for 16SrRNA and K2P + G4 for 28SrRNA.The phylogenetic analyses of maximum likelihood (ML) were held at the W-IQ-TREE web server 50 , and the support of the nodes was evaluated with 1000 ultrafast bootstrap replicates 51 .For the species confirmation, we performed the Automatic Barcode Gap Discovery (ABGD 52 ) and the multi-rate Poisson Tree Process (mPTP 53 ) on the COI dataset.Our alignments were processed in the ABGD web (https:// bioin fo.mnhn.fr/ abi/ public/ abgd/ abgdw eb.html) using the Kimura two-parameter substitution model prior for a maximum value of intraspecific divergence between 0.001 and 0.100, 15 recursive steps, and a gap width of 1.0.For the mPTP method, the species were inferred on ML trees in the program's web interface (http:// mptp.h-its.org).mPTP improves upon the Poisson Tree Processes for single-locus species delimitation, where instead of all species sharing the same evolution rate (λ), as in PTP, each species branch has its own λ in the mPTP model.

Figure 2 .
Figure 2. Minimum spanning network of 28SrRNA above, COI in the middle, and 16SrRNA below.Each node represents a haplotype, its size corresponds to the haplotype frequency and its colours indicate the locality.Black nodes represent inferred ancestral nodes.Numbers on branches represent mutational steps.

Figure 3 .
Figure 3. (a) PCA plots for 'continent' and 'islands' Ampithoe populations showing the first two axis for SNPs dataset.Dots represent individuals, with colours denoting the localities.(b) Structure results from A. marcuzzii s.l.dataset (k = 2).Each vertical bar corresponds to one individual, and the colour indicates the proportion of membership in each genetic cluster.(c) Main migration routes inferred from Fastsimcoal for SNPs.Arrows width correspond to comparative amount of migrants (see figures S15 and S16 for more information on migration rate).(d,e) Structure results from 'continent' (k = 2) and 'islands' (k = 3).Small maps show 'continent' and 'islands' geographic structure.ABR Abrolhos Archipelago, BOI Boipeba Island, CF Cabo Frio, FOR Forte Beach, NAT Natal, SINO Ilhabela, SPD Ubatuba, FN Fernando de Noronha, RA Rocas Atoll, TR Trindade Island, CDM Caribbean Dominica, SSV San Sauveur.

Figure 4 .
Figure 4. Comparisons among demographic models of population expansion for 'continent' (E, NE and SE), and 'islands' (OI, CDM and SSV) for SNPs dataset.Scenarios with lower AIC are highlighted in red shadow.E eastern Brazil, NE northestern Brazil, SE southeastern Brazil, CDM Caribbean Dominica, OI Brazilian oceanic islands (Rocas Atoll, Fernando de Noronha and Trindade), SSV San Sauveur (Dominica).

Figure 5 .
Figure 5. Modelled distributions of 'continent' and 'islands' for the Last Glacial Maximum (21 kya) and present (0 kya) scenarios.The suitability color range predicts how adequate the environment is for the species occurrence.